vincent van gogh
- North America > United States > Virginia (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
One Image is Worth a Thousand Words: A Usability Preservable Text-Image Collaborative Erasing Framework
Li, Feiran, Xu, Qianqian, Bao, Shilong, Yang, Zhiyong, Cao, Xiaochun, Huang, Qingming
Concept erasing has recently emerged as an effective paradigm to prevent text-to-image diffusion models from generating visually undesirable or even harmful content. However, current removal methods heavily rely on manually crafted text prompts, making it challenging to achieve a high erasure (efficacy) while minimizing the impact on other benign concepts (usability). In this paper, we attribute the limitations to the inherent gap between the text and image modalities, which makes it hard to transfer the intricately entangled concept knowledge from text prompts to the image generation process. To address this, we propose a novel solution by directly integrating visual supervision into the erasure process, introducing the first text-image Collaborative Concept Erasing (Co-Erasing) framework. Specifically, Co-Erasing describes the concept jointly by text prompts and the corresponding undesirable images induced by the prompts, and then reduces the generating probability of the target concept through negative guidance. This approach effectively bypasses the knowledge gap between text and image, significantly enhancing erasure efficacy. Additionally, we design a text-guided image concept refinement strategy that directs the model to focus on visual features most relevant to the specified text concept, minimizing disruption to other benign concepts. Finally, comprehensive experiments suggest that Co-Erasing outperforms state-of-the-art erasure approaches significantly with a better trade-off between efficacy and usability. Codes are available at https://github.com/Ferry-Li/Co-Erasing.
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > Texas > Loving County (0.04)
- North America > Canada (0.04)
- (2 more...)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
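The "negative guidance" the Co-Erasing abstract mentions can be sketched in a few lines. This is a generic classifier-free-guidance variant with the sign flipped, not the authors' implementation; all names and the guidance scale are illustrative, and in practice the two noise predictions would come from a diffusion U-Net:

```python
import numpy as np

def negative_guidance(eps_uncond, eps_concept, scale=2.0):
    """Steer a diffusion denoising step *away* from a target concept.

    Standard classifier-free guidance moves the prediction toward a
    condition; flipping the sign reduces the probability of generating
    the concept instead.
    """
    return eps_uncond - scale * (eps_concept - eps_uncond)

# Toy noise predictions (in practice these come from the U-Net).
eps_uncond = np.zeros(4)
eps_concept = np.ones(4)  # prediction conditioned on the unwanted concept
eps = negative_guidance(eps_uncond, eps_concept, scale=2.0)
print(eps)  # moved in the direction opposite to the concept
```

Co-Erasing's contribution, per the abstract, is conditioning this kind of guidance on images as well as text prompts.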
Boosting Alignment for Post-Unlearning Text-to-Image Generative Models
Ko, Myeongseob, Li, Henry, Wang, Zhun, Patsenker, Jonathan, Wang, Jiachen T., Li, Qinbin, Jin, Ming, Song, Dawn, Jia, Ruoxi
Large-scale generative models have shown impressive image-generation capabilities, propelled by massive data. However, this often inadvertently leads to the generation of harmful or inappropriate content and raises copyright concerns. Driven by these concerns, machine unlearning has become crucial to effectively purge undesirable knowledge from models. While existing literature has studied various unlearning techniques, these often suffer from either poor unlearning quality or degradation in text-image alignment after unlearning, due to the competitive nature of these objectives. To address these challenges, we propose a framework that seeks an optimal model update at each unlearning iteration, ensuring monotonic improvement on both objectives. We further derive the characterization of such an update. In addition, we design procedures to strategically diversify the unlearning and remaining datasets to boost performance improvement. Our evaluation demonstrates that our method effectively removes target classes from recent diffusion-based generative models and concepts from stable diffusion models while maintaining close alignment with the models' original trained states, thus outperforming state-of-the-art baselines. Our code will be made available at \url{https://github.com/reds-lab/Restricted_gradient_diversity_unlearning.git}.
- North America > United States > Virginia (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
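The abstract's "optimal model update ensuring monotonic improvement on both objectives" is in the spirit of gradient surgery. A minimal sketch of one generic projection-based update of that flavor (not necessarily the paper's derived characterization; all names illustrative):

```python
import numpy as np

def non_conflicting_update(g_forget, g_retain):
    """Combine an unlearning gradient with a retention gradient so that,
    to first order, the step does not increase the retention loss.

    When the two gradients conflict (negative inner product), project
    the forgetting gradient onto the plane orthogonal to the retention
    gradient. A generic trick, not the paper's exact update.
    """
    dot = g_forget @ g_retain
    if dot < 0:  # objectives conflict
        g_forget = g_forget - (dot / (g_retain @ g_retain)) * g_retain
    return g_forget

g_f = np.array([1.0, -2.0])  # toy unlearning gradient
g_r = np.array([1.0, 1.0])   # toy retention (alignment) gradient
step = non_conflicting_update(g_f, g_r)
print(step @ g_r)  # ~0: retention loss unchanged to first order
```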
Safeguard Text-to-Image Diffusion Models with Human Feedback Inversion
Kim, Sanghyun, Jung, Seohyeon, Kim, Balhae, Choi, Moonseok, Shin, Jinwoo, Lee, Juho
Existing models rely heavily on internet-crawled data, wherein problematic concepts persist due to incomplete filtration processes. While previous approaches somewhat alleviate the issue, they often rely on text-specified concepts, introducing challenges in accurately capturing nuanced concepts and aligning model knowledge with human understandings. In response, we propose a framework named Human Feedback Inversion (HFI), where human feedback on model-generated images is condensed into textual tokens guiding the mitigation or removal of problematic images. The proposed framework can be built upon existing techniques for the same purpose, enhancing their alignment with human judgment. By doing so, we simplify the training objective with a self-distillation-based technique, providing a strong baseline for concept removal. Our experimental results demonstrate our framework significantly reduces objectionable content generation while preserving image quality, contributing to the ethical deployment of AI in the public sphere.
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Asia > Japan (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
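HFI condenses human feedback on generated images into textual tokens. As a rough, assumption-laden analogy only: below, logistic regression stands in for the real diffusion-model objective, random vectors stand in for image embeddings, and the "pseudo-token" is just a learned vector that separates flagged from benign feedback, textual-inversion style:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy image embeddings: "flagged" images share a hidden concept direction.
concept_dir = rng.normal(size=8)
concept_dir /= np.linalg.norm(concept_dir)
flagged = rng.normal(size=(32, 8)) * 0.1 + concept_dir
benign = rng.normal(size=(32, 8)) * 0.1 - concept_dir

# Learn a single pseudo-token embedding from the feedback labels.
token = np.zeros(8)
for _ in range(200):
    for x, y in [(flagged, 1.0), (benign, 0.0)]:
        p = 1 / (1 + np.exp(-(x @ token)))
        token -= 0.1 * ((p - y) @ x) / len(x)  # logistic-loss gradient step

cos = token @ concept_dir / np.linalg.norm(token)
print(cos)  # the learned token aligns with the flagged-concept direction
```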
Editing Massive Concepts in Text-to-Image Diffusion Models
Xiong, Tianwei, Wu, Yue, Xie, Enze, Wu, Yue, Li, Zhenguo, Liu, Xihui
While previous methods have mitigated these issues on a small scale, it is essential to handle them simultaneously in larger-scale real-world scenarios. We propose a two-stage method, Editing Massive Concepts In Diffusion Models (EMCID). The first stage performs memory optimization for each individual concept with dual self-distillation from a text alignment loss and a diffusion noise prediction loss. The second stage conducts massive concept editing with multi-layer, closed-form model editing. We further propose a comprehensive benchmark, named ImageNet Concept Editing Benchmark (ICEB), for evaluating massive concept editing in T2I models, with two subtasks, free-form prompts, massive concept categories, and extensive evaluation metrics. Extensive experiments conducted on our proposed benchmark and previous benchmarks demonstrate the superior scalability of EMCID for editing up to 1,000 concepts, providing a practical approach for fast adjustment and re-deployment of T2I diffusion models in real-world applications.
- North America > United States (0.97)
- North America > Mexico (0.14)
- Europe > United Kingdom (0.14)
- (3 more...)
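The "multi-layer, closed-form model editing" in EMCID's second stage belongs to the family of rank-one linear-layer edits. A generic single-layer sketch of that family (not EMCID's exact multi-layer update; all names illustrative): given a key `k`, rewrite weight matrix `W` so it maps `k` to a new value `v_star`, using a key covariance `C` to limit disturbance to other keys.

```python
import numpy as np

def closed_form_edit(W, k, v_star, C):
    """Rank-one closed-form edit so that W_new @ k == v_star.

    C is an (estimated) covariance of keys; weighting the update by
    C^{-1} k keeps the edit's effect on other keys small.
    """
    Cinv_k = np.linalg.solve(C, k)
    return W + np.outer(v_star - W @ k, Cinv_k) / (k @ Cinv_k)

rng = np.random.default_rng(1)
W = rng.normal(size=(4, 4))
C = np.eye(4)                 # toy covariance
k = rng.normal(size=4)        # key for the concept being edited
v_star = rng.normal(size=4)   # desired output for that key
W_new = closed_form_edit(W, k, v_star, C)
print(np.allclose(W_new @ k, v_star))  # the edited key maps to the target
```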
Art Authentication with Vision Transformers
Schaerf, Ludovica, Popovici, Carina, Postma, Eric
In recent years, Transformers, initially developed for language, have been successfully applied to visual tasks. Vision Transformers have pushed the state of the art in a wide range of tasks, including image classification, object detection, and semantic segmentation. While ample research has shown promising results in art attribution and art authentication using Convolutional Neural Networks, this paper examines whether the superiority of Vision Transformers extends to art authentication, thus improving the reliability of computer-based authentication of artworks. Using a carefully compiled dataset of authentic paintings by Vincent van Gogh and two contrast datasets, we compare the art authentication performance of Swin Transformers with that of EfficientNet. Using a standard contrast set containing imitations and proxies (works by painters with styles closely related to van Gogh), we find that EfficientNet achieves the best overall performance. With a contrast set consisting only of imitations, we find the Swin Transformer superior to EfficientNet, achieving an authentication accuracy of over 85%. These results lead us to conclude that Vision Transformers represent a strong and promising contender in art authentication, particularly in enhancing the computer-based ability to detect artistic imitations.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
- Europe > Switzerland (0.04)
- Europe > Netherlands (0.04)
- (4 more...)
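The basic move that distinguishes the Vision Transformers compared above from CNNs is tokenization: the painting is split into fixed-size patches, each flattened into a vector the Transformer treats as a token (Swin then additionally partitions these into shifted windows). A minimal sketch of that patch-embedding step, with standard ViT sizes:

```python
import numpy as np

def to_patches(image, patch=16):
    """Split an (H, W, C) image into flattened non-overlapping patches,
    producing the token sequence a Vision Transformer operates on."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0
    return (image.reshape(h // patch, patch, w // patch, patch, c)
                 .transpose(0, 2, 1, 3, 4)
                 .reshape(-1, patch * patch * c))

img = np.zeros((224, 224, 3))   # toy stand-in for a painting crop
tokens = to_patches(img)
print(tokens.shape)  # (196, 768): 14x14 patches, each a 768-dim token
```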
Could AI Ever Pass the Van Gogh Test?
That is, the Van Gogh Test for sheer creativity. This past Thursday night, Discovery Institute's tech summit COSM 2022 presented a live, in-person interview with Federico Faggin, the Italian physicist and computer engineer who co-won the prestigious Kyoto Prize in 1997 for helping develop the Intel 4004 chip. Faggin was interviewed by technology reporter Maria Teresa Cometto, who asked him to regale the audience with tales about helping to design early microchips. Eventually Faggin recounted a time when he was "studying neuroscience and biology, trying to understand how the brain works," and came upon a startling realization: And at one point I asked myself, "But wait a second, I mean these books, all this talk about electrical signals, biochemical signals, but when I taste some chocolate, I mean I have a taste. A computer, does it taste this? Does it have a sensation or a feeling for the signals that he has in his memory or in his CPU? So where are sensations and feelings coming from?" … And so I discovered what was later called the hard problem of consciousness.
David O. Houwen on LinkedIn: #AI #LLMs #OpenAI
Do we really care more about Van Gogh's sunflowers than real ones? George Monbiot, The Guardian. The response by the media and government to the two Just Stop Oil activists who threw soup at Vincent van Gogh's Sunflowers in the National Gallery in London speaks volumes. Decorating the glass protecting the painting with tomato soup (the painting itself was, as the protesters calculated, undamaged) appears to horrify some people more than the collapse of our planet, which these campaigners are seeking to prevent. Everywhere I see claims that the "extreme" tactics of environmental campaigners will prompt people to "stop listening". But how could we listen any less to the warnings of scientists and campaigners and eminent committees?
Creative Painting with Latent Diffusion Models
Artistic painting has achieved significant progress during recent years. Using an autoencoder to map the original images into a compressed latent space and a cross-attention-enhanced U-Net as the backbone of diffusion, latent diffusion models (LDMs) have achieved stable and high-fidelity image generation. In this paper, we focus on enhancing the creative painting ability of current LDMs in two directions: textual condition extension and model retraining on the WikiArt dataset. Through textual condition extension, users' input prompts are expanded with rich contextual knowledge for deeper understanding and interpretation of the prompts. The WikiArt dataset contains 80K famous artworks created over the past 400 years by more than 1,000 famous artists in a rich range of styles and genres. Through the retraining, we are able to ask these artists to paint novel and creative works on modern topics. Direct comparisons with the original model show that the creativity and artistry are enriched.
- North America > United States > Texas > Loving County (0.05)
- Asia > China > Guangdong Province > Shenzhen (0.05)
- Asia > China > Tibet Autonomous Region (0.05)
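The "cross attention enhanced U-Net" the abstract describes injects the (possibly expanded) text prompt into image generation via scaled dot-product cross-attention: latent image positions produce queries, text tokens produce keys and values. A minimal numpy sketch with toy, illustrative dimensions:

```python
import numpy as np

def cross_attention(q, k, v):
    """Scaled dot-product cross-attention: image queries attend over
    text-token keys/values, as in an LDM's conditioning layers."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    # Numerically stable softmax over the text tokens.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(2)
q = rng.normal(size=(64, 32))  # 64 latent-pixel queries
k = rng.normal(size=(8, 32))   # 8 text-token keys
v = rng.normal(size=(8, 32))   # 8 text-token values
out = cross_attention(q, k, v)
print(out.shape)  # (64, 32): one text-informed feature per latent position
```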